Approximate Lesion Localization in Dermoscopy Images 
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Abstract 

Background: Dermoscopy is one of the major imaging modalities used in the diagnosis of melanoma and other pigmented 
skin lesions. Due to the difficulty and subjectivity of human interpretation, automated analysis of dermoscopy images has 
become an important research area. Border detection is often the first step in this analysis. Methods: In this article, we 
present an approximate lesion localization method that serves as a preprocessing step for detecting borders in dermoscopy 
images. In this method, first the black frame around the image is removed using an iterative algorithm. The approximate 
location of the lesion is then determined using an ensemble of thresholding algorithms. Results: The method is tested on 
a set of 428 dermoscopy images. The localization error is quantified by a metric that uses dermatologist determined borders 
as the ground truth. Conclusion: The results demonstrate that the method presented here achieves both fast and accurate 
localization of lesions in dermoscopy images. 



Introduction 



Jllalignant melanoma, the most deadly form of skin cancer, is one of the most rapidly increasing cancers in the world, with an 
^g stimated incidence of 62,480 and an estimated total of 8,420 deaths in the United States in 2008 alone pQ. Early diagnosis is 
^p articularly important since melanoma can be cured with a simple excision if detected early. 

. Dermoscopy, also known as epiluminescence microscopy, has become one of the most important tools in the diagnosis of 
^ifelanoma and other pigmented skin lesions. This non-invasive skin imaging technique involves optical magnification, which 
•^aakes subsurface structures more easily visible when compared to conventional clinical images [2]. This in turn reduces 
^^creening errors and provides greater differentiation between difficult lesions such as pigmented Spitz nevi and small, clinically 
^quivocal lesions [3]. However, it has also been demonstrated that dermoscopy may actually lower the diagnostic accuracy 
" rn'the hands of inexperienced dermatologists [4... Therefore, in order to minimize the diagnostic errors that result from the 
difficulty and subjectivity of visual interpretation, the development of computerized image analysis techniques is of paramount 
importance [5]. 

Automated border detection is often the first step in the automated or semi-automated analysis of dermoscopy images 
[6j [7J [8j [9j [TUJ [TTJ [12] . It is crucial for the image analysis for two main reasons. First, the border structure provides important 
information for accurate diagnosis, as many clinical features, such as asymmetry, border irregularity, and abrupt border cutoff, 
are calculated directly from the border. Second, the extraction of other important clinical features such as atypical pigment 
networks, globules, and blue-white areas, critically depends on the accuracy of border detection. Automated border detection 
is a challenging task due to several reasons: (i) low contrast between the lesion and the surrounding skin, (ii) irregular and 
fuzzy lesion borders, (iii) artifacts such as black frames, skin lines, hairs, and air bubbles, (iv) variegated coloring inside the 
lesion. 

A number of methods have been developed for preprocessing dermoscopy images. Most of these focused on the removal of 
artifacts such as hairs and bubbles. Of the studies dealing with hair removal, Lee et al. [13] and Schmid [6] approached the 
problem using mathematical morphology. Fleming et al. [5] applied curvilinear structure detection with various constraints 
followed by gap filling. More recently, Zhou et al. [14] improved Fleming et aZ.'s approach using feature guided examplar-based 
inpainting. A method for bubble removal was introduced in [5], where the authors utilized a morphological top-hat operator 
followed by a radial search procedure. 

In this article, we present a method for approximate lesion localization in dermoscopy images. First, the black frame 
around the image is removed using an iterative algorithm. Then, the approximate location of the lesion is determined using 
an ensemble of thresholding algorithms. 
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2 Materials and Methods 



2.1 Black Frame Removal 

Dermoscopy images often contain black frames that are introduced during the digitization process. These need to be removed 
because they might interfere with the subsequent lesion localization procedure. In order to determine the darkness of a pixel 
with (R, G, B) coordinates, the lightness component of the HSL color space [15] is utilized: 



L 



max(i?, G, B) + min(i?, G, B) 



(1) 



In particular, a pixel is considered to be black if its lightness value is less than 20. Using this criterion, the image is scanned 
row-by-row starting from the top. A particular row is labeled as part of the black frame if it contains 60% black pixels. The 
top-to-bottom scan terminates when a row that contains less than the threshold percentage of pixels is encountered. The same 
scanning procedure is repeated for the other three main directions. Fig. [T] shows the result of this procedure on a sample image. 





(a) Original image (b) After frame removal 

Figure 1: Black frame removal 



2.2 Approximate Lesion Localization 

Although dermoscopy images can be quite large, the actual lesion often occupies a relatively small area. Therefore, if we can 
determine the approximate location of the lesion, the border detection algorithm can focus on this region rather than the 
whole image. An accurate bounding box (the smallest axis- aligned rectangular box that encloses the lesion) might be useful 
for various reasons: (i) it provides an estimate of the lesion size (certain image segmentation algorithms such as region growing 
and morphological flooding can use the size of the region as a termination criterion), (ii) it might improve the border detection 
accuracy since the procedure is focused on a region that is guaranteed to contain the lesion, (iii) it speeds up the border 
detection since the procedure is performed on a region that is often smaller than the whole image, (iv) its surrounding might 
be utilized in the estimation of the background skin color, which is useful for various operations including the elimination of 
spurious regions that are discovered during the border detection procedure [10] and the extraction of dermoscopic features such 
as blotches [16] and blue- white areas [T7] . 

In many dermoscopic images, the lesion can be roughly separated from the background skin using a grayscale thresholding 
method applied to the blue channel [8, 9 . While there are a number of thresholding methods that perform well in general, 
the effectiveness of a method strongly depends on the statistical characteristics of the image [18]. Fig. [2] illustrates this 
phenomenon^. Here, methods |2(d)| |2(e) , and 2(g)| perform quite well. In contrast, methods 2(c) and |2(h)| underestimate 
the optimal threshold, whereas method 2(f) overestimates the optimal threshold. Although method 2(c)| is the most popular 
thresholding algorithm in the literature [25], for this particular image, it performs the second worst. 

A possible approach to overcome this problem is to fuse the results provided by an ensemble of thresholding algorithms. 
In this way, it is possible to exploit the peculiarities of the participating thresholding algorithms synergistically, thus arriving 
at more robust final decisions than is possible with a single thresholding algorithm. We note that the goal of the fusion is not 
to outperform the individual thresholding algorithms, but to obtain accuracies comparable to that of the best thresholding 
algorithm independently of the image characteristics. In this study, we used the threshold fusion method proposed by Melgani 
[18], which we describe briefly in the following. 

Let X = {x mn :m = 0, 1,...,M — 1, n = 0, 1, . . . , TV — l}be the original scalar M x N image with L possible gray levels 
(x mn G {0, 1, . . . , L — 1}) and Y = {y mn : m = 0, 1, . . . , M — 1, n = 0, 1, . . . , N — 1} be the binary output of the threshold 
fusion. Consider an ensemble of P thresholding algorithms. Let Ti and (i = 1,2, ... , P) be the threshold value and the 



1 The frame of this image is left intact for visualization purposes. 
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(a) Original image (b) Blue channel 




(c) Otsu's method [19] (T = 137) (d) Kapur et aUs method [20] (T = 178) 




(g) Sahoo et aVs method 23 (T = 179) (h) Li & Tarn's method [24] (T = 59) 



Figure 2: Comparison of various thresholding methods (T: threshold) 

output binary image associated with the i-th algorithm of the ensemble, respectively. Within a Markov Random Field (MRF) 
framework the fusion problem can be formulated as an energy minimization task. Accordingly, the local energy function U mn 
to be minimized for the pixel (m, n) can be written as follows: 

p 

U mn = p S p ■ Usp [ymn,Y s (m,n)] + ^ ft • Uj/ [y mn , Af (m, n)] (2) 

where 5 is a predefined neighborhood system associated with pixel (m, n), Us'p(-) and Ujj(-) refer to the spatial and inter- 
image energy functions, respectively, whereas fisp and ft (i = 1,2,...,P) represent the spatial and inter-image parameters, 
respectively. The spatial energy function can be expressed as: 

U<?P [^ mn ,F 5 (m,n)] = - ^ l (yrnn,y P q) (3) 

y pq eY s (m,n) 
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where !(.,.) is the indicator function defined as: 



1 if Urnn — — ypq 

I(ymn,y Pq ) = I otll e r wise ^ (4) 
The inter-image energy function is defined as: 

U/j [y mn , A s (m, n)]=- ^ a l (x pq ) • I [y mn , Ai(p, q)} (5) 

Ai(p,q)eAf (m,n) 

where a % {-) is a weight function given by: 

c^{x mn ) = 1 -exp(-7|x mn -T^l) (6) 

This function controls the effect of unreliable decisions at the pixel level that can be incurred by the thresholding algorithms. 
At the global (image) level decisions are weighed by the inter-image parameters (3i (i = 1,2, ... ,P), which are computed as 
follows: 

A=exp(- 7 |f-Ti|) (7) 

where T is the average threshold value: 

p 



? =iE r < ( 8 ) 



p 

1=1 

The MRF fusion strategy proposed in [18] is as follows: 

1. Apply each thresholding algorithm of the ensemble to the image X to generate the set of thresholded images A{ (i = 
1,2,. ..,P) 

2. Initialize Y by minimizing for each pixel (m, n) the local energy function U mn defined in Eq. [2] without the spatial energy 
term i.e., by setting (3s p = 0. 

3. Update Y by minimizing for each pixel (m, n) the local energy function U mn defined in Eq. [2] including the spatial energy 
term i.e., by setting (3s p ^ 0. 

4. Repeat step 3 K max times or until the number of different labels in Y computed over the last two iterations becomes 
very small. 

In our preliminary experiments, we observed that, besides being computationally demanding, the iterative part (step 3) of 
the fusion algorithm makes only marginal contribution to the quality of the results. Therefore, in this study, we considered 
only the first two steps. The 7 parameter was set to the recommended value of 0.1 [18J. For computational reasons, a (Eq. [6]) 
and (3 (Eq. [7j) values were precalculated and the neighborhood system S was chosen as a 3 x 3 square. 

The most important performance factor in the fusion algorithm seems to be the choice of the thresholding algorithms. We 
considered six popular thresholding algorithms to construct the ensemble: Otsu's [19] . Kapur et a/.'s [20], Huang & Wang's 
[2T] . Yen et a/.'s [22J , Sahoo et a/.'s [23J, and Li & Tarn's [24] methods. In order to determine the best combination, we 
evaluated ensembles with 3 (20 ensembles), 4 (15 ensembles), 5 (6 ensembles), and 6 (1 ensembles) methods. 

Fig. [3] shows the output of four particular ensembles: Otsu-Kapur-Huang, Yen-Sahoo-Li, Otsu-Kapur-Huang-Yen, and 
Huang- Yen-Sahoo-Li. Note that each ensemble contains at least one method that either underestimates or overestimates the 
optimal threshold. It can be seen that each ensemble performs equally well, which demonstrates that failures in pathological 
cases might be prevented using a proper fusion strategy. 

Fig. 4(a)| shows the result of the ensemble Otsu-Kapur-Huang-Sahoo. Here, the blue bounding box encloses the dermatologist 



determined border (see Section [3]), whereas the red one encloses the binary output of the threshold fusion. It can be seen that 
the red box is completely contained within the blue box. This was observed in many cases because the automated thresholding 
methods tend to find the sharpest pigment change, whereas the dermatologists choose the outmost detectable pigment. We 
experimented with two different expansion methods to solve this problem. The first one involves expanding the automatic box 
by P% in four main directions. In other words, an automatic box of size Mb x Nb is expanded by Mb • P/100 pixels in the 
West and East directions and Nb • P/100 pixels in the North and South directions. The second one involves incrementing 
the threshold values obtained by each algorithm in the ensemble by G gray levels. In the rest of this article, we will refer to 
these expansion methods as non-adaptive and adaptive, respectively. Figs. 4(a)| and |4(b)| show the results of these methods 



with the expanded box shown in green. In this particular example, the non-adaptive method performs better in bringing the 
automatic box closer to the manual one. In order to determine the optimal expansion amounts we evaluated P G {2,4,6,8} 
and G G {4,6,8,10}. 
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(c) Otsu-Kapur-Huang-Yen (d) Huang- Yen- Sahoo-Li 

Figure 3: Comparison of various threshold ensembles 





(a) Non-adaptive expansion with P = 3% (b) Adaptive expansion with G = 6 

Figure 4: Comparison of the bounding box expansion methods 



3 Results and Discussion 

The proposed method was tested on a set of 428 dermoscopy images obtained from the EDRA Interactive Atlas of Dermoscopy 
[2] and the Keio University Hospital. These were 24-bit RGB color images with dimensions ranging from 771 x 507 pixels to 
768 x 512 pixels. An experienced dermatologist (WVS) determined the manual borders by selecting a number of points on the 
lesion border, which were then connected by a second-order B-spline. The bounding box error was quantified using the grading 
system developed by Hance et al. [26] 



Area( AutomaticB ox © ManualBox) 
Ar ea,(ManualB ox) 



(9) 



where AutomaticB ox is the binary image obtained by filling the bounding box of the fusion output, ManualBox is the 
binary image obtained by filling the bounding box of the dermatologist-determined border, © is the exclusive-OR operation, 
which essentially determines the pixels for which the AutomaticBox and ManualBox disagree, and Area(J) denotes the number 
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Table 1: Ensemble statistics (/i: mean, a: std. dev., ef. initial box error, e x : expanded box error) 



Ensemble 


Expansion Method 




a 6i 






Us 


cr s 


Otsu-Kapur-Huang-Sahoo 


Non-adaptive (P = 2) 


10.25 


8.10 


7.58 


8.13 


268.31 


185.64 


Otsu-Huang- Yen-Li 


Non-adaptive (P = 4) 


11.92 


7.59 


7.89 


6.30 


260.55 


183.85 


Otsu-Huang-Sahoo-Li 


Non-adaptive (P = 4) 


11.98 


7.62 


7.90 


6.20 


260.95 


184.14 


Otsu-Huang-Sahoo 


Non-adaptive (P = 2) 


11.14 


7.17 


7.91 


6.71 


273.84 


195.69 


Otsu-Kapur-Huang-Sahoo 


Adaptive (G = 6) 


10.25 


8.10 


9.27 


7.68 


276.92 


192.14 


Kapur-Huang-Sahoo-Li 


Adaptive (G = 8) 


10.98 


7.66 


9.43 


7.69 


279.03 


194.42 


Otsu-Kapur-Huang-Sahoo 


Adaptive (G = 4) 


10.25 


8.10 


9.44 


7.56 


279.98 


194.26 


Kapur-Huang-Sahoo-Li 


Adaptive (G = 6) 


10.98 


7.66 


9.67 


7.58 


282.09 


196.58 



Table 2: Individual statistics (fi: mean, a: std. dev., ef. initial box error, e x : expanded box error) 



Thresholding Method 


Expansion Method 








°~e x 


Ms 


OS 


Otsu 


Non-adaptive (P = 2) 


12.05 


9.10 


9.00 


8.95 


275.07 


199.28 


Kapur 


Non-adaptive (P = 2) 


12.87 


16.86 


12.68 


17.56 


261.95 


197.94 


Huang 


Non-adaptive (P = 2) 


20.31 


67.97 


17.17 


69.76 


269.59 


190.09 


Yen 


Non-adaptive (P = 2) 


14.98 


27.12 


15.74 


27.74 


255.61 


250.53 


Sahoo 


Non-adaptive (P = 2) 


13.43 


24.60 


13.37 


25.19 


254.43 


184.36 


Li 


Non-adaptive (P = 2) 


15.12 


9.65 


11.06 


9.07 


293.54 


215.80 


Otsu 


Non-adaptive (P = 4) 


12.05 


9.10 


9.10 


9.14 


256.86 


182.82 


Kapur 


Non-adaptive (P = 4) 


12.87 


16.86 


15.54 


18.61 


245.36 


183.78 


Huang 


Non-adaptive (P = 4) 


20.31 


67.97 


16.83 


70.69 


251.99 


174.44 


Yen 


Non-adaptive (P = 4) 


14.98 


27.12 


19.32 


28.49 


239.46 


230.91 


Sahoo 


Non-adaptive (P = 4) 


13.43 


24.60 


16.43 


25.98 


238.32 


170.38 


Li 


Non-adaptive (P = 4) 


15.12 


9.65 


9.41 


7.99 


273.93 


198.61 



of pixels in the binary image /. 

We determined the optimal parameter combination for the presented approximate bounding box computation method as 
follows. First, the black frame removal procedure described in Section 12.11 is performed on each image in the data set. The 
lesion bounding box is then computed using the fusion method described in Section l2?2l with one of the 42 ensembles. Finally, 
the approximate bounding box is expanded using either the non-adaptive method with P G {2, 4, 6, 8} or the adaptive method 
with G G {4, 6, 8, 10}. Table Q] shows various statistics associated with the four most accurate ensembles for each expansion 
method. The last two columns refer to the mean and standard deviation values, respectively for the percentage image size 
reduction, i.e. ^ rea ( A u ^omaticBox ) provided by the bounding box computation. The following observations are in 

order: (i) both expansion methods reduce the mean bounding box error, (ii) the lowest mean errors were obtained using the 
ensemble Otsu-Kapur-Huang-Sahoo, (iii) the non-adaptive expansion method was more effective than the adaptive one, (iv) the 
computation of the bounding box reduced the original image size by about 260%. 

The adaptive method was less effective than the non-adaptive one probably because the former often expands the approxi- 
mate box by unpredictable amounts: either too little (as in Fig. |4(b)[ ) or too much depending on the shape of the histogram 
and the value of the G parameter. In contrast, the latter always expands the approximate box by an amount specified by the 
P parameter. 

Table [2] shows the statistics for the individual thresholding methods. Note that, due to space limitations, we report only 
the results of the non-adaptive expansion method (as in the ensemble case, the adaptive method has inferior performance). It 
can be seen that, in most configurations, the individual methods obtain significantly higher mean errors than the best ensemble 
methods, i.e. the first four rows of Table [TJ This is because, as explained in Section [2^2) the individual methods are more prone 
to catastrophic failures when given pathological input images. The high standard deviation values also support this explanation. 
Only the performance of Otsu (with P = 2, 4) and Li et aVs (with P = 4) methods is close to the performance of the ensembles. 
However, as mentioned in Section 12. 2\ the goal of fusion is not to outperform the individual thresholding algorithms, but to 
obtain accuracies comparable to that of the best thresholding algorithm independently of the image characteristics. 

As mentioned in Section 12^2] an accurate bounding box can provide an estimate of the lesion size, i.e. Ay e&(ManualB order). 
In order to verify this, we calculated the best fitting line for Area( AutomaticB ox) vs. Area,(ManualB order) using the generalized 
least-squares method [27] : 

Ar e&(ManualB order) « Area( AutomaticB ox) • 0.861271 - 15627.226419 (10) 

where ManualB order is the binary image obtained by filling the dermatologist-determined border. The accuracy of this relation 
was calculated by plugging the area of the approximate bounding box for each image into Eq. [lOj and then comparing the 
result with the actual area of the lesion, which is calculated from the dermatologist-determined border. The percentage mean 



Skin Research and Technology, 15(3): 314-322, 2009 



and standard deviation errors over the entire image set were 11.88 and 11.49, respectively. These results demonstrate that the 
lesion size can be estimated from the bounding box area with relatively high accuracy. An even better estimate can be made 
from the binary output of the threshold fusion. The best fitting line for Area>(FusionOutput) vs. Are&(ManualB order) was 
calculated as: 

Ar e&(ManualB order) w Areci(FusionOutput) • 1.158209 - 4485.287871 (11) 

where FusionOutput is the binary output of the threshold fusion. The percentage mean and standard deviation errors for this 
relation were 8.16 and 8.54, respectively. 

Fig. [5] shows sample bounding box computation results obtained using the ensemble Otsu-Kapur-Huang-Sahoo with P = 2. 
It can be seen that the presented method determines an accurate bounding box even for lesions with fuzzy borders. We note 
that while the expansion operation is useful in most cases, in some cases such as Fig. |5(d)| it might deteriorate the results 
slightly. 

4 Conclusions 

In this paper, an automated method for approximate lesion localization in dermoscopy images is presented. The method is 
comprised of three main phases: black frame removal, initial bounding box computation using an ensemble of thresholding 
algorithms, and expansion of the initial bounding box. The execution time of the method is about 0.15 seconds for a typical 
image of size 768 x 512 pixels on an Intel Pentium D 2.66Ghz computer. 

The presented method may not perform well on images with significant amount of hair or bubbles since these elements 
alter the histogram, which in turn results in biased threshold computations. For images with hair, a preprocessor such as 
DullRazor™ [13] might be helpful. Unfortunately, the development of a reliable bubble removal method remains an open 
problem. 

Future work will be directed towards testing the utility of the presented method in a border detection study. The imple- 
mentation of the threshold fusion method will be made publicly available as part of the Fourier image processing and analysis 
library, which can be downloaded from http://sourceforge.net/projects/fourier-ipal 
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